-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernfs_memcg: Add helpers to gather memcgroup related data #96
base: main
Are you sure you want to change the base?
Conversation
1909949
to
8c618c9
Compare
I have added other helpers and modified the earlier ones, so that they work with other UEK versions as well |
2d9b918
to
d84a204
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating memcg helpers. I added couple comments. And beside that, please create a bug for it and put the number in the git log.
Signed-off-by: Imran Khan <[email protected]>
Thanks @biger410 for reviewing this. I have addressed your review comments. Could you please have a look and let me know if you have any further feedback |
Add kernfs based helpers to extract memcg related information, like number of active and inactive memcgroups, page cache pages pinning memcgroups etc. Orabug: 37322867 Signed-off-by: Imran Khan <[email protected]>
Orabug: 37322867 Signed-off-by: Imran Khan <[email protected]>
Orabug: 37322867 Signed-off-by: Imran Khan <[email protected]>
bb98c05
to
e0875fa
Compare
The new changes look good to me. One more request is that can we add a corelen module for it? It should run either with -M option or -A option. |
Add corelens module to dump pages that are pinning memcgroups. By default upto 10K such pages are shown. Orabug: 37322867 Signed-off-by: Imran Khan <[email protected]>
I have added corelens module. python3 -m drgn_tools.corelens ~/Local-vmcores/Workqueue-study/UEK-6-rds-issue/VMCORE -d ~/Local-vmcores/Workqueue-study/UEK-6-rds-issue/ page: 0xffffc35c881b0000 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0040 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0080 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b00c0 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0100 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0140 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0180 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b01c0 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip |
Is there a paste error? How could this trigger the corelen cmd?
And with 10000 default, how could user run corelen cmd to dump all pages? |
Yes, it was a copy paste error and missed the -M part . The actual command is: and output is like the one shown below: `imran@imran-metabox:~/oracle-samples-drgn-tools/drgn-tools$ python3 -m drgn_tools.corelens ~/Local-vmcores/Workqueue-study/UEK-6-rds-issue/VMCORE -d ~/Local-vmcores/Workqueue-study/UEK-6-rds-issue/ -M kernfs_memcg page: 0xffffc35c881b0040 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0080 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b00c0 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0100 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0140 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b0180 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip page: 0xffffc35c881b01c0 cgroup: /user.slice/user-0.slice/session-7128.scope state: CSS_ONLINE|CSS_VISIBLE path: /datastore/oracle_230124_0745EST/diag/asm/cell/scaqat06celadm02/incpkg/pkg_76/seq_1/cell/IPSPKG_20230120060027_COM_1.zip ............................................................ Regarding default page count of 10K, I am using this value to make sure we don't end up spending a lot of time while collecting this data. We have seen that scanning all pages can take hours , so my idea here is that we get information of 10K pages and if that does not indicate anything conclusive , we can later scan all pages. Let me know if its sounds okay or using something other than 10K would be more acceptable |
This as of now is just a dump of some of my bespoke debug scripts.